fix: unblock Archon when launched from a Claude Code terminal and on Bun 1.3+ by samuelcombey · Pull Request #1097 · coleam00/Archon

samuelcombey · 2026-04-11T16:57:25Z

Summary

Two blockers hit during a fresh setup on macOS, from inside a Claude Code terminal on Bun 1.3.12 — both reproduce deterministically and leave the orchestrator silently hung with no actionable error.

1. Claude Agent SDK subprocess refuses to launch (nested-session guard)

The parent Claude Code shell exports these markers:

`CLAUDECODE=1`
`CLAUDE_CODE_ENTRYPOINT`
`CLAUDE_CODE_EXECPATH`
`CLAUDE_CODE_HIDE_ACCOUNT_INFO`
`CLAUDE_CODE_NO_FLICKER`
`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS`

Archon's existing cleanup in `buildSubprocessEnv()` (`packages/core/src/clients/claude.ts:167`) only stripped `CLAUDECODE`. Worse, the Claude Agent SDK leaks `process.env` into the spawned child regardless of the explicit `env` option passed to `query()` — so even vars that `SUBPROCESS_ENV_ALLOWLIST` correctly excludes still reach the subprocess. The child then hits its own nested-session guard and the orchestrator hangs after logging `using_global_auth` twice, with no `query_failed`, no `subprocess_error`, no outbound connection to api.anthropic.com, and no visible child process.

The only bulletproof fix until the SDK stops leaking `process.env` is to delete the markers at the process level at server and CLI entry points. Auth vars (`CLAUDE_CODE_OAUTH_TOKEN`, `CLAUDE_CODE_USE_BEDROCK`, `CLAUDE_CODE_USE_VERTEX`) are kept — only the six nested-session markers are removed.
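The process-level strip described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `stripNestedSessionMarkers` and the `Env` type are hypothetical, while the six marker names are the ones listed in the description.

```typescript
// Hypothetical helper for illustration; only the six marker names come from the PR text.
const NESTED_SESSION_MARKERS = [
  "CLAUDECODE",
  "CLAUDE_CODE_ENTRYPOINT",
  "CLAUDE_CODE_EXECPATH",
  "CLAUDE_CODE_HIDE_ACCOUNT_INFO",
  "CLAUDE_CODE_NO_FLICKER",
  "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS",
] as const;

type Env = Record<string, string | undefined>;

/** Deletes the six markers from `env` in place and returns the names that were removed. */
function stripNestedSessionMarkers(env: Env): string[] {
  const removed: string[] = [];
  for (const name of NESTED_SESSION_MARKERS) {
    if (name in env) {
      delete env[name];
      removed.push(name);
    }
  }
  return removed;
}
```

At an entry point this would presumably be called as `stripNestedSessionMarkers(process.env)` before any application module is imported. Auth variables such as `CLAUDE_CODE_OAUTH_TOKEN` are not in the list, so they pass through untouched.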

2. Telegram adapter dies on first polling error on Bun 1.3+

Telegraf's `redactToken()` (`node_modules/telegraf/lib/core/network/client.js:248`) does:

```js
error.message = error.message.replace(/\/(bot|user)(\d+):[^/]+\//, '/$1$2:[REDACTED]/');
```

On Bun 1.3+, `Error.message` on certain error instances (fetch/network errors) is non-writable, so the assignment throws `TypeError: Attempted to assign to readonly property.`, kills the polling loop, and leaves the adapter silently dead. The single log line is:

```
telegram.start_failed TypeError: Attempted to assign to readonly property.
at redactToken (telegraf/lib/core/network/client.js:248:5)
```

Patched via `bun patch telegraf` to wrap the assignment in `try/catch`. The original error is still re-thrown unchanged, so Telegraf's own retry/reconnect logic runs normally; the only difference is that we survive one specific Bun runtime quirk.
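To make the shape of the patch concrete, here is a small standalone sketch of a redaction step that tolerates a read-only `message`. It is an approximation rather than the patch itself: `redactTokenSafely` and `makeReadonlyMessageError` are hypothetical names, and the real `redactToken()` re-throws the error where this sketch returns it so the behaviour is easy to check.

```typescript
// Same pattern as the Telegraf line quoted above: hides "/bot<id>:<token>/" path segments.
const TOKEN_PATTERN = /\/(bot|user)(\d+):[^/]+\//;

/** Redacts in place when possible; leaves the error untouched if `message` is read-only. */
function redactTokenSafely(error: Error): Error {
  try {
    error.message = error.message.replace(TOKEN_PATTERN, "/$1$2:[REDACTED]/");
  } catch {
    // `message` is read-only on this runtime; continue with the original error.
  }
  return error;
}

/** Hypothetical test double that imitates an error whose `message` cannot be assigned. */
function makeReadonlyMessageError(message: string): Error {
  const error = new Error("placeholder");
  Object.defineProperty(error, "message", {
    value: message,
    writable: false,
    configurable: false,
  });
  return error;
}
```

With a writable message the token segment is replaced. With the read-only double the call completes without throwing and the message is left as it was, which is the trade-off the patch accepts.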

3. Incidental: `bun.lock` re-sync

`bun patch --commit` re-serialized `bun.lock` and picked up workspace version bumps that were already in the individual `package.json` files (`0.3.5`) but stale in the lockfile (`0.1.0`/`0.2.0`). No behavioral change — just brings the lockfile in sync with reality.

Files

`packages/server/src/index.ts` — strip six markers right after dotenv loads, before any application imports
`packages/cli/src/cli.ts` — same strip at CLI entry
`patches/telegraf@4.16.3.patch` — `try/catch` around the `redactToken` assignment
`package.json` — `patchedDependencies` entry for the telegraf patch
`bun.lock` — pick up `patchedDependencies` + workspace version re-sync

Reproduction

From inside a Claude Code terminal on Bun 1.3+:

```bash
# Before this PR
archon chat "say hi"
# Hangs forever after "using_global_auth". No error.

# After this PR
archon chat "say hi"
# "Hello!" (or similar)
```

And for the Telegram bug: configure a bot token, start the server, run `curl https://api.telegram.org/bot/getUpdates` to trigger a 409 conflict in the long-poll — before this PR the adapter logs `telegram.start_failed` and stops polling forever; after, it logs the upstream error and reconnects.

Test plan

`bun run validate` passes (type-check, lint, format, tests)
`archon chat` works end-to-end
Telegram bot replies to DMs
Web UI chat replies
Verified CI passes on this branch

Related latent issues (not fixed in this PR)

`~/.archon/workspaces` is the orchestrator's fallback cwd for new conversations without a registered codebase (`orchestrator-agent.ts:743`). If the directory doesn't exist, Bun's `spawn()` hangs silently instead of erroring, which is another way to reproduce exactly the same symptoms as blocker 1 above. A follow-up PR should call `mkdirSync(getArchonWorkspacesPath(), { recursive: true })` during server/CLI startup.
The `process.env` leak in the Claude Agent SDK (the root cause of blocker 1) should be reported upstream to `@anthropic-ai/claude-agent-sdk`.
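The suggested follow-up for the missing workspaces directory could look something like the sketch below. `ensureWorkspacesDir` is a hypothetical name, and the demo deliberately targets a temporary directory instead of the real `~/.archon/workspaces`; only `mkdirSync(..., { recursive: true })` is taken from the text above.

```typescript
import { existsSync, mkdirSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/** Creates `dir` and any missing parents; a no-op when the directory already exists. */
function ensureWorkspacesDir(dir: string): string {
  mkdirSync(dir, { recursive: true });
  return dir;
}

// Demo against a throwaway directory rather than the real ~/.archon/workspaces.
const demoRoot = mkdtempSync(join(tmpdir(), "archon-demo-"));
const workspaces = ensureWorkspacesDir(join(demoRoot, ".archon", "workspaces"));
console.log(existsSync(workspaces)); // true
```

Because `recursive: true` does not throw when the directory is already present, the call is safe to run on every startup.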

Summary by CodeRabbit

Bug Fixes
- Fixed error message handling in network operations to work across different runtime environments.
Chores
- Enhanced environment variable cleanup during CLI and server startup to prevent context leakage to spawned subprocesses.
- Applied dependency patch to improve stability.

…Bun 1.3+ Two blockers hit during a fresh setup from inside a Claude Code session on Bun 1.3.12; both reproduce deterministically. 1) Claude Agent SDK subprocess refuses to launch (nested-session guard). The parent Claude Code shell exports CLAUDECODE, CLAUDE_CODE_ENTRYPOINT, CLAUDE_CODE_EXECPATH, CLAUDE_CODE_HIDE_ACCOUNT_INFO, CLAUDE_CODE_NO_FLICKER, and CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS. Archon's existing cleanup in buildSubprocessEnv() only stripped CLAUDECODE, and the SDK leaks process.env into the spawned child regardless of the explicit `env` option passed to query() — so the subprocess sees the remaining markers, hits its own nested-session guard, and the orchestrator hangs after `using_global_auth` with no error. Strip all six markers from process.env at server and CLI entry points so nothing can leak. Auth vars (OAUTH_TOKEN, USE_BEDROCK, USE_VERTEX) are kept. 2) Telegram adapter dies on first polling error on Bun 1.3+. Telegraf's redactToken() does `error.message = error.message.replace(...)` to sanitize bot tokens from error stacks. On Bun 1.3+, Error.message on fetch/network error instances is non-writable, so the assignment throws TypeError, kills the polling loop, and leaves the adapter silently dead with `telegram.start_failed`. Patch telegraf@4.16.3 via `bun patch` to wrap the assignment in try/catch; the original error is still re-thrown so Telegraf's own retry/reconnect logic runs normally. Also picks up an incidental bun.lock re-sync from `bun patch --commit`: the workspace package versions in the lockfile were stale (0.1.0/0.2.0) while the actual package.json files have been 0.3.5 for a while.

coderabbitai · 2026-04-11T16:57:39Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f7eb5279-65aa-4218-9828-390bb246d6f8

📥 Commits

Reviewing files that changed from the base of the PR and between 536584d and f41f4e8.

⛔ Files ignored due to path filters (1)

bun.lock is excluded by !**/*.lock

📒 Files selected for processing (4)

package.json
packages/cli/src/cli.ts
packages/server/src/index.ts
patches/telegraf@4.16.3.patch

Walkthrough

This PR adds environment variable cleanup at startup to prevent nested-session context detection in spawned subprocesses, removes Claude Code markers (CLAUDECODE and related variables) from process.env in CLI and server entry points, and patches telegraf@4.16.3 to safely handle non-writable error messages in Bun environments.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Environment Variable Cleanup `packages/cli/src/cli.ts`, `packages/server/src/index.ts` | Deletes Claude Code nested-session environment markers (`CLAUDECODE`, `CLAUDE_CODE_ENTRYPOINT`, `CLAUDE_CODE_EXECPATH`, `CLAUDE_CODE_HIDE_ACCOUNT_INFO`, `CLAUDE_CODE_NO_FLICKER`, `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS`) from `process.env` at startup; preserves auth/config variables like `CLAUDE_USE_GLOBAL_AUTH`. |
| Telegraf Patch Setup `package.json` | Adds `patchedDependencies` entry to apply `patches/telegraf@4.16.3.patch` to telegraf@4.16.3. |
| Telegraf Token Redaction Robustness `patches/telegraf@4.16.3.patch` | Wraps `error.message` mutation in `redactToken()` with try/catch to handle non-writable error messages in Bun runtime; suppresses failure and rethrows original error if assignment fails. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

bug: Claude Agent SDK subprocess hangs when Archon launched from inside Claude Code session #1030: Deletion of CLAUDECODE and CLAUDE_CODE_* environment variables directly addresses the nested-session env-var detection that causes Claude Agent SDK bundled CLI to hang.
2 issues: v0.3.5: CLI workflow run silently hangs — dotenv loads .env from CWD instead of ~/.archon/.env, + archon serve hardcodes skipPlatformAdapters:true — Telegram/Discord/Slack adapters are unreachable #1067: Removal of CLAUDECODE and CLAUDE_CODE_* environment markers directly prevents nested Claude Code subprocess deadlocks.

Possibly related PRs

fix: remove CWD env stripping, load ~/.archon/.env with override for binary support #1045: Modifies environment-variable handling in the same files (packages/cli/src/cli.ts and packages/server/src/index.ts), adjusting auth loading and env-var handling strategies.

Poem

🐰 Away with the nested markers go,
CLAUDECODE and friends—out they flow!
Telegraf now safely redacts with care,
Through Bun's strict world without a snare.
try/catch in patches, env vars cleaned—
Smoother spawning than ever has been! 🌿

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix: unblock Archon when launched from a Claude Code terminal and on Bun 1.3+' accurately summarizes the primary fixes: removing environment variable markers and patching Telegraf for Bun 1.3+ compatibility. |
| Description check | ✅ Passed | The PR description is comprehensive and detailed, covering the two main blockers, root causes, fixes applied, reproduction steps, and test validation. It aligns well with template expectations despite not following the exact template sections. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f41f4e8917


chatgpt-codex-connector · 2026-04-11T17:00:40Z

+    // original error rather than killing the polling loop.
+    try {
+        error.message = error.message.replace(/\/(bot|user)(\d+):[^/]+\//, '/$1$2:[REDACTED]/');
+    } catch (_) { /* message is read-only; re-throw original */ }


Preserve token redaction on readonly message errors

Catching and ignoring failures in redactToken means Bun readonly-message errors are rethrown with the original unredacted text, which can include /bot<id>:<token>/ and leak the Telegram bot token into logs. In this exact Bun 1.3+ path, the new behavior fixes retries but drops the security guarantee this function exists to provide; please keep retry behavior while still throwing a redacted error payload.
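One way to keep both properties the review asks for (survive a read-only `message` and still avoid surfacing the token) is to fall back to a fresh error that carries the redacted text. The sketch below is an assumption about how that could be done, not code from Telegraf or from this PR; `redactOrReplace` is a hypothetical name, and the real function would throw the result where this one returns it.

```typescript
// Same pattern as in the patched line above.
const TOKEN_PATTERN = /\/(bot|user)(\d+):[^/]+\//;

/**
 * Returns an error whose message has the token segment redacted.
 * Mutates the original when allowed; otherwise builds a replacement error.
 */
function redactOrReplace(error: Error): Error {
  const redacted = error.message.replace(TOKEN_PATTERN, "/$1$2:[REDACTED]/");
  if (redacted === error.message) return error; // nothing to hide

  try {
    error.message = redacted;
    // In non-strict code a failed assignment is silent, so verify it took effect.
    if (error.message === redacted) return error;
  } catch {
    // Read-only message: fall through to the replacement below.
  }

  const replacement = new Error(redacted);
  replacement.name = error.name;
  return replacement;
}
```

The replacement intentionally does not attach the original error as `cause`, since logging the cause would print the unredacted text again. The cost is that the original stack trace is lost on this path.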



Wirasm · 2026-04-12T09:07:06Z

Thanks for the excellent debugging on the nested Claude Code session issue — your finding that the SDK leaks process.env regardless of the env option was the key insight we were missing.

We've integrated your CLAUDECODE + CLAUDE_CODE_* marker stripping into #1092, with one difference: we pattern-match on CLAUDE_CODE_* (minus auth vars) rather than hardcoding 6 specific names, so future markers are handled automatically. The stripping runs inside stripCwdEnv() at entry point, covering both CLI and server.
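A prefix-based variant of the strip, as described in the comment above, might look like the following. This is a hedged sketch and not the actual `stripCwdEnv()` implementation: the function names are hypothetical, and only the `CLAUDECODE` name, the `CLAUDE_CODE_` prefix and the three preserved auth variables are taken from this thread.

```typescript
// Auth variables that must survive the strip, as listed in this thread.
const KEPT_AUTH_VARS = new Set([
  "CLAUDE_CODE_OAUTH_TOKEN",
  "CLAUDE_CODE_USE_BEDROCK",
  "CLAUDE_CODE_USE_VERTEX",
]);

/** True for CLAUDECODE and for any CLAUDE_CODE_* name that is not a kept auth variable. */
function isNestedSessionMarker(name: string): boolean {
  if (name === "CLAUDECODE") return true;
  return name.startsWith("CLAUDE_CODE_") && !KEPT_AUTH_VARS.has(name);
}

/** Deletes every matching key from `env` in place and returns the removed names. */
function stripMarkersByPrefix(env: Record<string, string | undefined>): string[] {
  const removed = Object.keys(env).filter(isNestedSessionMarker);
  for (const name of removed) {
    delete env[name];
  }
  return removed;
}
```

Compared with a hardcoded list, a marker that Claude Code adds later is removed without a code change, at the price of also removing any future `CLAUDE_CODE_*` variable that turns out to be needed until it is added to the kept set.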

The Telegraf Bun 1.3+ fix in this PR is still needed — #1092 doesn't address that. Would you be willing to resubmit a PR with just the Telegraf bun patch fix? Note that Cole also has #1066 open which replaces Telegraf with grammY entirely — you might want to coordinate with him on which approach lands.

Summary:

CLAUDECODE marker stripping → absorbed into fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout #1092 ✅
Telegraf readonly Error.message patch → still valuable, please resubmit separately

…t timeout (#1067, #1030, #1098, #1070)

* fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout (#1067)

  Three bugs fixed: (1) Bun auto-loads CWD .env files before user code, leaking non-overlapping keys into the Archon process — new stripCwdEnv() boot import removes them before any module reads env. (2) archon serve hardcoded skipPlatformAdapters:true, preventing Slack/Telegram/Discord from starting. (3) Claude SDK query had no first-event timeout, causing silent 30-min hangs when the subprocess wedges — new withFirstMessageTimeout wrapper races the first event against a configurable deadline (default 60s).

  Changes:
  - Add @archon/paths/strip-cwd-env and strip-cwd-env-boot modules
  - Import boot module as first import in CLI entry point
  - Remove skipPlatformAdapters: true from serve.ts
  - Add withFirstMessageTimeout + diagnostics to ClaudeClient
  - Add CLAUDECODE=1 nested-session warning to CLI
  - Add 9 unit tests (6 strip-cwd-env + 3 timeout)

  Fixes #1067

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review findings for PR #1092

  Fixed:
  - Clear setTimeout timer in withFirstMessageTimeout finally block (HIGH-1)
  - Add strip-cwd-env-boot to server/src/index.ts for direct dev:server path (MEDIUM-1)
  - Warn to stderr on non-ENOENT errors in stripCwdEnv (MEDIUM-2)
  - Update stale configuration.md docs for new env-loading mechanism (HIGH-2)
  - Add ARCHON_CLAUDE_FIRST_EVENT_TIMEOUT_MS and ARCHON_SUPPRESS_NESTED_CLAUDE_WARNING env vars to docs
  - Add nested Claude Code hang troubleshooting entry
  - Fix boot module JSDoc: "CLI and server" → "CLI" only
  - Fix stripCwdEnv JSDoc: remove stale "override: true" reference
  - Update .claude/rules/cli.md startup behavior section
  - Update CLAUDE.md @archon/paths description with new exports

  Tests added:
  - Assert controller.signal.aborted on timeout
  - Handle generator that completes immediately without yielding
  - Strip distinct keys from different .env files

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* simplify: replace string sentinel with typed error class in withFirstMessageTimeout

  Replace the '__timeout__' string sentinel used to identify timeout rejections with a dedicated FirstEventTimeoutError class. instanceof checks are more explicit and robust than string comparison on error messages.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review findings — dotenv version, docs, server warning, marker strip, tests

  1. Align dotenv to ^17 (was ^16, rest of monorepo uses ^17.2.3)
  2. Remove incorrect SUBPROCESS_ENV_ALLOWLIST claim from docs — the SDK bypasses the env option and uses process.env directly (#1097)
  3. Add CLAUDECODE=1 warning to server entry point (was only in CLI)
  4. Add diagnostic payload content test for withFirstMessageTimeout
  5. Integrate #1097's finding: strip CLAUDECODE + CLAUDE_CODE_* session markers (except auth vars) + NODE_OPTIONS + VSCODE_INSPECTOR_OPTIONS from process.env at entry point. Pattern-matched on CLAUDE_CODE_* prefix rather than hardcoding 6 names, so future Claude Code markers are handled automatically. Auth vars (CLAUDE_CODE_OAUTH_TOKEN, CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) are preserved.

  Root cause per #1097: the Claude Agent SDK leaks process.env into the spawned child regardless of the explicit env option, so the only way to prevent the nested-session deadlock is to delete the markers from process.env at the entry point.

  Validation: bun run validate passes, 125 paths tests (6 new marker tests), 60 claude tests (1 new diagnostic test), DATABASE_URL leak verified stripped (target repo .env DATABASE_URL does not affect Archon DB selection).

* refactor: remove SUBPROCESS_ENV_ALLOWLIST — trust user config, strip only CWD

  The allowlist was wrong for a single-developer tool:
  - It blocked keys the user intentionally set in ~/.archon/.env (ANTHROPIC_API_KEY, AWS_*, CLAUDE_CONFIG_DIR, MiniMax vars, etc.)
  - It was bypassed by the SDK anyway (process.env leaks to subprocess regardless of the env option — see #1097)
  - It attracted a constant stream of PRs adding keys (#1060, #1093, #1099)

  New model: CWD .env keys are the only untrusted source. stripCwdEnv() at entry point handles that. Everything in ~/.archon/.env + shell env passes through to the subprocess. No filtering, no second-guessing.

  Changes:
  - Delete env-allowlist.ts and env-allowlist.test.ts
  - Simplify buildSubprocessEnv() to return { ...process.env } with auth-mode logging (no token stripping — user controls their config)
  - Replace 4 allowlist-based tests with 1 pass-through test
  - Remove env-allowlist.test.ts from core test batch
  - Update security.md and cli.md docs to reflect the new model

  The CLAUDECODE + CLAUDE_CODE_* marker strip and NODE_OPTIONS strip remain in stripCwdEnv() at entry point — those are process-level safety (not per-subprocess filtering) and are needed regardless.

* fix: restore override:true for archon env, add integration tests

  The integration tests caught a real issue: without override:true, the ~/.archon/.env load doesn't win over shell-inherited env vars. If the user's shell profile exports PORT=9999 and ~/.archon/.env has PORT=3000, the user expects Archon to use 3000.

  stripCwdEnv() handles CWD .env files (untrusted). override:true handles shell-inherited vars (trusted but less specific than ~/.archon/.env). Different concerns, both needed.

  Also adds 6 integration tests covering the full entry-point flow:
  1. Global auth user with ANTHROPIC_API_KEY in CWD .env — stripped
  2. OAuth token in archon env + random key in CWD — CWD stripped, archon kept
  3. General leak test — nothing from CWD reaches subprocess
  4. Same key in both CWD and archon — archon value wins
  5. CLAUDECODE markers stripped even when not from CWD .env
  6. CLAUDE_CODE_OAUTH_TOKEN survives marker strip

* test: add DATABASE_URL leak scenarios to env integration tests

* fix: move CLAUDECODE warning into stripCwdEnv, remove dead useGlobalAuth logic

  Review findings addressed:
  1. CLAUDECODE warning was dead code — the boot import deleted CLAUDECODE from process.env before the warning check in cli.ts/server/index.ts could fire. Moved the warning into stripCwdEnv() itself, emitted BEFORE the deletion. Removed duplicate warning code from both entry points.
  2. useGlobalAuth token stripping removed (intentional, not regression) — the old code stripped CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_API_KEY when useGlobalAuth=true. Per design discussion: the user controls ~/.archon/.env and all keys they set are intentional. If they want global auth, they just don't set tokens. Simplified buildSubprocessEnv to log auth mode for diagnostics only, no filtering.
  3. Docs "no override needed" corrected — cli.md and configuration.md now reflect the actual code (override: true).

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
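The first-event timeout described in the commit message can be illustrated with a short sketch. The names `withFirstMessageTimeout` and `FirstEventTimeoutError` come from the commit text, but the signature, the error message and the helper `firstEventOrTimeout` are assumptions made for this example and may differ from the merged code.

```typescript
/** Typed error so callers can use `instanceof` instead of comparing message strings. */
class FirstEventTimeoutError extends Error {
  readonly timeoutMs: number;
  constructor(timeoutMs: number) {
    super(`no first event within ${timeoutMs}ms`);
    // Keeps `instanceof` working even when compiled to older targets.
    Object.setPrototypeOf(this, FirstEventTimeoutError.prototype);
    this.name = "FirstEventTimeoutError";
    this.timeoutMs = timeoutMs;
  }
}

/** Races the iterator's first result against a deadline; aborts `controller` on timeout. */
async function firstEventOrTimeout<T>(
  iterator: AsyncIterator<T>,
  controller: AbortController,
  timeoutMs: number,
): Promise<IteratorResult<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new FirstEventTimeoutError(timeoutMs)), timeoutMs);
  });
  try {
    return await Promise.race([iterator.next(), deadline]);
  } catch (error) {
    if (error instanceof FirstEventTimeoutError) controller.abort();
    throw error;
  } finally {
    clearTimeout(timer); // always clear so a fast first event leaves no pending timer
  }
}

/** Passes events through unchanged; only the first event is subject to the deadline. */
async function* withFirstMessageTimeout<T>(
  source: AsyncIterable<T>,
  controller: AbortController,
  timeoutMs = 60_000,
): AsyncGenerator<T> {
  const iterator = source[Symbol.asyncIterator]();
  let result = await firstEventOrTimeout(iterator, controller, timeoutMs);
  while (!result.done) {
    yield result.value;
    result = await iterator.next();
  }
}
```

A source that completes without yielding anything simply ends the wrapper, and a source that never produces a first event surfaces as a `FirstEventTimeoutError` with the controller aborted, which matches the test cases listed in the commit message.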

Wirasm · 2026-04-13T11:02:39Z

Closing — this PR bundles two unrelated fixes:

Nested Claude Code marker stripping — superseded by security: prevent target repo .env from leaking into subprocesses (#1135) #1169 (merged into dev). stripCwdEnv() handles marker cleanup at boot, and executableArgs: ['--no-env-file'] prevents re-injection in the Claude subprocess. E2E verified from within a Claude Code session.
Telegraf readonly property crash on Bun 1.3+ — still a valid bug, tracked separately in Telegram adapter crashes on Bun: TypeError readonly property in Telegraf #1042 with an alternative approach in fix: replace Telegraf with grammY to fix Bun TypeError crash #1066 (replace Telegraf with grammY).

Thanks for identifying both issues — the marker stripping analysis directly informed the structural fix that landed.


Wirasm mentioned this pull request Apr 12, 2026

fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout #1092

Merged

Wirasm mentioned this pull request Apr 12, 2026

bug: Claude Agent SDK subprocess hangs when Archon launched from inside Claude Code session #1030

Closed

Wirasm closed this Apr 13, 2026

Wirasm mentioned this pull request Apr 20, 2026

feat(paths/cli/setup): unify env load + write on three-path model (#1302, #1303) #1304

Merged

samuelcombey wants to merge 1 commit into coleam00:dev from samuelcombey:fix/claude-sdk-env-and-telegraf-bun-compat
